A Random Matrix Theoretical Approach to Early Event Detection 

Using Experimental Data 

Yingshuang Cao, Long Cai, Robert C. Jie Gu, Xing He, Qian Ai, and Zhijian Jin 

In this paper, high-dimensional data analysis methods are proposed to handle random matrices composed of real data from a power network before and after a fault. The mean spectral radius (MSR) of non-Hermitian random matrices is defined as a statistical analytic for fault detection. Grid failures are detected by analyzing the characteristics of the random matrices and observing the changes in their spectral radius. This paper describes the basic mathematical theory of this big data method and verifies it using real-world data from a power grid in China. 


Index Terms: Fault recognition and diagnosis, big data, spectral radius, random matrices, modern smart grid 


I. Introduction 

The growing volume of data has become a strategic resource in smart grids [1]. These datasets contain valuable information and signals. Our team uses big data theory to detect faults by analyzing these datasets. The results can be used to improve the safety and stability of modern power systems. 

The amount of real-time information in power systems increases quickly with the development of the smart grid. These data are mostly unstructured and come from different sources. In recent years, more and more data have been collected from Phasor Measurement Units (PMUs), Intelligent Electronic Devices (IEDs), Supervisory Control and Data Acquisition (SCADA) systems, and so on [2]. The relationships among these high-dimensional data are complex. Existing research has mainly focused on determining the fault detection signal and on processing that signal [3-4], but these methods become inefficient or invalid when applied to a complex power grid. With the advent of the era of big data, big data theory has been applied to many fields [5-7] and has proven to be a good way to analyze massive, high-dimensional data. In recent years, big data methods have also been studied for the analysis of faults and disturbances in power systems, and some promising results have been obtained [8,9]. 

Fault detection is the keystone of our research and also the key to improving the safety and stability of modern power systems. Fault detection requires continuous monitoring and processing of massive quantities of data in order to detect and identify emerging patterns, which means telling signals from noise. For smart grids, uncertain locations of PMUs, small random fluctuations of loads and generators, and sampling errors can be regarded as noise, whereas sudden changes of loads or generators, system faults, and network reconfiguration are signals. It is a big challenge to extract such analysis from streaming data in smart grids within a tolerable elapsed time and with limited hardware resources. 

The work of [8] is the most closely related to our paper. Xie, Chen and Kumar used principal component analysis (PCA) for early event detection with both simulations and experimental data. Our work builds upon theirs but differs from it in a fundamental sense. PCA is a widely adopted approach in unsupervised machine learning. First, they extracted features from the data in the training phase; second, they used the extracted features for early event detection. On the one hand, their pioneering work motivates our research; on the other hand, we are motivated by the line of research in the spirit of [5-7,10]. The central idea of our approach is to model a large power system as a high-dimensional statistical problem, so that we can exploit the high dimensionality of the massive datasets. The high dimensionality of the data is a blessing, not a curse. The work of [13] is also related. In that paper we studied the power grid using simulations only, whereas the current paper concerns experimental data. The objectives of the two papers are also different. 

In this paper, random matrix theory is used to model real-life data from a power grid in China. This is essentially an anomaly detection problem in the big data literature. It may be modeled as binary hypothesis testing: the normal hypothesis $H_0$ (no signal present) and the anomaly hypothesis $H_1$ (signal present). Traditionally, the dataset under hypothesis $H_0$ is modeled as a standard Gaussian noise random vector whose entries are $X_i \sim \mathcal{CN}(0,1)$, $i = 1, \dots, n$. For our real-life data, the traditional Gaussian model does not seem natural. We use the universality of random matrix theory to argue that the Gaussian random matrix can be used to model our real data, even though the data are non-Gaussian. This universality is valid only when the data dimension is high. This is in some sense like the central limit theorem. 

The anomaly hypothesis $H_1$ is claimed when the data deviate from the normal hypothesis $H_0$. Our central analysis is focused on the latter. 

We use the measurement data without knowledge of any other power grid information. In other words, our black-box approach is model-free and purely data-driven, relying on the high dimensionality of the data. Abstract statistical laws in high-dimensional probability serve as the basis for our approach [5,6,7]. The unifying features of these abstract results have proved effective on real-life data. 

This paper is organized as follows. Section II explains principal component analysis and random matrix theory. Based on the theory in Section II, the related algorithms and a test case are presented in Section III. Conclusions and possible future research directions are presented in Section IV. 

II. The Basic Principles of Random Matrix Theory 
and Principal Component Analysis 

A. Random Matrix Theory 

1) Marchenko-Pastur Law (MP Law) 

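The MP law describes the asymptotic behavior of the singular values of large rectangular random matrices [11]. Let $X = \{x_{i,j}\}$ be an $N \times T$ ($N/T = c \in (0,1)$) random matrix whose entries are independent identically distributed (i.i.d.) random variables with mean $\mu = 0$ and variance $\sigma^2 < \infty$. 

The empirical spectral density (ESD) of the corresponding sample covariance matrix $S = \frac{1}{T} X X^H$ (i.e., $f(\lambda(S))$) converges to the distribution of the MP law with density function: 

$$
f_{MP}(\lambda) =
\begin{cases}
\dfrac{1}{2\pi c \sigma^2 \lambda}\sqrt{(b-\lambda)(\lambda-a)}, & a \le \lambda \le b \\
0, & \text{otherwise}
\end{cases}
\tag{1}
$$

where $a = \sigma^2(1-\sqrt{c})^2$, $b = \sigma^2(1+\sqrt{c})^2$, and $c = N/T$. 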
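
As a minimal numerical sketch of the MP law (not part of the original experiments), the following NumPy code evaluates the density of eq. (1) and compares its support with the eigenvalues of a simulated sample covariance matrix. The function names `mp_density` and `sample_cov_eigs`, the sizes N = 200 and T = 800, the Gaussian entries, and the normalization S = X X^H / T are all assumptions made for illustration.

```python
import numpy as np

def mp_density(lam, c, sigma2=1.0):
    """Evaluate the Marchenko-Pastur density for ratio c = N/T.

    Returns the density values together with the support edges a and b.
    """
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    a = sigma2 * (1.0 - np.sqrt(c)) ** 2
    b = sigma2 * (1.0 + np.sqrt(c)) ** 2
    dens = np.zeros_like(lam)
    inside = (lam > a) & (lam < b)          # density is zero outside [a, b]
    dens[inside] = np.sqrt((b - lam[inside]) * (lam[inside] - a)) / (
        2.0 * np.pi * c * sigma2 * lam[inside]
    )
    return dens, a, b

def sample_cov_eigs(N, T, rng):
    """Eigenvalues of S = X X^T / T for an N x T standard Gaussian matrix X."""
    X = rng.standard_normal((N, T))
    return np.linalg.eigvalsh(X @ X.T / T)

rng = np.random.default_rng(0)
N, T = 200, 800                              # illustrative sizes, c = 0.25
eigs = sample_cov_eigs(N, T, rng)

grid = np.linspace(0.0, 3.0, 30001)
dens, a, b = mp_density(grid, N / T)
total_mass = dens.sum() * (grid[1] - grid[0])  # Riemann sum, should be close to 1

print(f"MP support: [{a:.3f}, {b:.3f}]")
print(f"eigenvalue range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"integral of density: {total_mass:.4f}")
```

For i.i.d. data the simulated eigenvalues should stay close to the interval [a, b]; a clear departure from that interval is the kind of deviation the detection method looks for.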

2) Kernel Density Estimation (KDE) 

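A nonparametric estimate of the empirical spectral density of the sample covariance matrix is used: 

$$
\hat{f}(x) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left(\frac{x-\lambda_i}{h}\right) \tag{2}
$$

where $\lambda_i$ ($i = 1, 2, \dots, N$) are the eigenvalues of $S$, and $K(\cdot)$ is the kernel function for bandwidth parameter $h$. 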
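
The kernel estimate can be sketched in a few lines of NumPy. The text does not say which kernel is used, so the Gaussian kernel below is an assumption, as are the function name `kde_spectral_density`, the three example eigenvalues, and the bandwidth h = 0.3.

```python
import numpy as np

def kde_spectral_density(x, eigenvalues, h):
    """Kernel density estimate (1 / (N h)) * sum_i K((x - lambda_i) / h).

    K is taken to be the standard Gaussian kernel (an assumption).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    lam = np.asarray(eigenvalues, dtype=float)
    u = (x[:, None] - lam[None, :]) / h          # all pairwise scaled distances
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (lam.size * h)

# Hypothetical eigenvalues, used only to exercise the estimator.
eigenvalues = np.array([0.5, 1.0, 1.5])
grid = np.linspace(-5.0, 7.0, 4001)
density = kde_spectral_density(grid, eigenvalues, h=0.3)
mass = density.sum() * (grid[1] - grid[0])       # should be close to 1
print(f"integral of the estimate: {mass:.6f}")
```

A smaller bandwidth follows the individual eigenvalues more closely, while a larger one gives a smoother curve that is easier to compare with the MP density.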

3) The Single-Ring Law 

For each $n \ge 1$, let $A_n$ be a random matrix which admits the decomposition [12]: 

$$
A_n = U_n T_n V_n, \quad T_n = \mathrm{diag}(s_1, \dots, s_n) \tag{3}
$$

where the $s_i$ are positive, and $U_n$ and $V_n$ are two independent random unitary matrices which are Haar-distributed independently of $T_n$. Under certain mild conditions, the ESD of $A_n$ converges weakly in probability to a deterministic measure whose support is a single ring. Some outliers to the single-ring law are observed. 

Consider the matrix product $Z = \prod_{i=1}^{\alpha} X_{u,i}$, where $X_u$ is the singular value equivalent of the rectangular $N \times T$ non-Hermitian random matrix $X$, whose entries are independent identically distributed (i.i.d.) variables with mean $\mu(x_{k,:}) = 0$ and variance $\sigma^2(x_{k,:}) = 1$ for $k = 1, 2, \dots, N$. The matrix product $Z$ is converted to $\tilde{Z}$ by a transform which makes the variance $\sigma^2(\tilde{z}_{k}) = 1/N$ for $k = 1, 2, \dots, N$. Thus, the empirical spectral density of $\tilde{Z}$ converges almost surely to the same limit given by 

$$
f_{\tilde{Z}}(\lambda) =
\begin{cases}
\dfrac{1}{\pi c \alpha} |\lambda|^{(2/\alpha - 2)}, & (1-c)^{\alpha/2} \le |\lambda| \le 1 \\
0, & \text{otherwise}
\end{cases}
\tag{4}
$$

as $N, T \to \infty$ with the ratio $N/T = c \in (0,1]$. On the complex plane of the eigenvalues, the inner circle radius is $(1-c)^{\alpha/2}$ and the outer circle radius is unity [13]. Moreover, $S = \tilde{Z}\tilde{Z}^H$ can be formed, and its ESD converges to the distribution of the MP law. 
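
The ring can be checked numerically with a short sketch for a single factor (a product of one matrix). The text does not spell out how the singular value equivalent is built; the construction used here, the Hermitian square root of X X^H multiplied by a Haar unitary matrix, is an assumption, as are the helper names (`haar_unitary`, `singular_value_equivalent`, `normalize_rows`, `mean_spectral_radius`) and the sizes N = 200, T = 400.

```python
import numpy as np

def haar_unitary(n, rng):
    """Draw an n x n Haar-distributed unitary matrix via QR of a complex Gaussian matrix."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    q, r = np.linalg.qr(g)
    d = np.diagonal(r)
    return q * (d / np.abs(d))                # fix the column phases left free by QR

def singular_value_equivalent(x, rng):
    """Square matrix X_u with X_u X_u^H = X X^H (assumed construction)."""
    w, v = np.linalg.eigh(x @ x.conj().T)
    root = (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T   # Hermitian square root
    return root @ haar_unitary(x.shape[0], rng)

def normalize_rows(z):
    """Rescale every row so that its variance equals 1/N."""
    n = z.shape[0]
    return z / (np.sqrt(n) * z.std(axis=1, keepdims=True))

def mean_spectral_radius(z_tilde):
    """Average modulus of the eigenvalues of a square matrix."""
    return np.abs(np.linalg.eigvals(z_tilde)).mean()

rng = np.random.default_rng(2)
N, T = 200, 400                               # illustrative sizes, c = 0.5
c = N / T
x = rng.standard_normal((N, T))
x_u = singular_value_equivalent(x, rng)
z_tilde = normalize_rows(x_u)

radii = np.abs(np.linalg.eigvals(z_tilde))
msr = mean_spectral_radius(z_tilde)
inner_radius = np.sqrt(1.0 - c)               # (1 - c)^(1/2) for a single factor
print(f"eigenvalue moduli in [{radii.min():.3f}, {radii.max():.3f}]")
print(f"predicted ring: [{inner_radius:.3f}, 1.000], MSR = {msr:.3f}")
```

For i.i.d. input the moduli should fall close to the predicted ring, and the mean spectral radius gives one number per matrix that can be tracked over time.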


4) Universality of the MP Law 

Akin to the central limit theorem, universality [5, page 347] refers to the phenomenon that the asymptotic distributions of various statistics of covariance matrices (such as eigenvalues and eigenvectors) are identical to those of Gaussian covariance matrices. These results let us calculate the exact asymptotic distributions of various test statistics without restrictive distributional assumptions on the matrix entries. The presence of the universality property suggests that high-dimensional phenomena are robust to the precise details of the model ingredients [14]. For example, one can perform various hypothesis tests under the assumption that the matrix entries are not Gaussian distributed but use the same test statistic as in the Gaussian case. 

The power grid data below can be viewed as a spatial and temporal sampling of a random graph. Randomness is introduced by the uncertainty of spatial locations and by system uncertainty. In real-life applications, we cannot expect the matrix entries to be i.i.d. Numerous studies based on both simulations [10] and experiments, however, demonstrate that the MP law is universally followed. In such cases, universality properties provide a crucial tool for reducing the proofs of general results to those in a tractable special case, which is the i.i.d. case in our paper. 
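
Universality can be illustrated with a small simulation, offered here as a sketch rather than as evidence about the grid data. It compares the second moment of the ESD for Gaussian, Rademacher, and uniform entries (all with zero mean and unit variance) with the value 1 + c of the MP law with unit variance. The entry distributions, the sizes, and the helper name `esd_second_moment` are assumptions, and the example covers only i.i.d. non-Gaussian entries, not dependent ones.

```python
import numpy as np

def esd_second_moment(X):
    """Second moment (1/N) tr(S^2) of the ESD of S = X X^T / T."""
    N, T = X.shape
    S = X @ X.T / T
    return np.trace(S @ S) / N

rng = np.random.default_rng(1)
N, T = 200, 800                               # illustrative sizes, c = 0.25
c = N / T

# Three entry distributions, each with mean 0 and variance 1.
draws = {
    "gaussian": rng.standard_normal((N, T)),
    "rademacher": rng.choice([-1.0, 1.0], size=(N, T)),
    "uniform": rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(N, T)),
}
moments = {name: esd_second_moment(X) for name, X in draws.items()}
mp_second_moment = 1.0 + c                    # second moment of the MP law, sigma^2 = 1

for name, value in moments.items():
    print(f"{name:>10}: {value:.4f}  (MP law: {mp_second_moment:.4f})")
```

The three values should agree with each other and with 1 + c up to finite-size fluctuations, which is the behavior the universality argument relies on.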


B. Principal Component Analysis (PCA) 

A method based on PCA is used to verify our results. PCA is one of the most commonly used linear dimensionality reduction methods. It reduces the dimensionality while preserving most of the variance of the original data. Due to the increasing size of PMU data, the dimensionality of PMU data has been studied in the recent literature. The analysis shows that the rank of the PMU data matrix is low: the massive PMU data essentially lie in a space of much lower dimension. 

Let $P$ denote the number of available PMUs across the whole power network and $p$ the number of PMUs used in the computation. Each PMU provides $l$ measurements. The interconnected power network contains a number of PMUs, each providing measurements of voltage magnitude and power flow. At each time sample, a total of $p \times l$ measurements are collected. In this paper, we conduct the analysis for each category of measurements independently. Define the measurement vector $v = [v^{(1)}, \dots, v^{(p)}]^{\top}$ containing the $p$ measurements, and use $v_t$ to denote the measurements at time $t$. Define the measurement matrix $Z_{p \times T} = [v_{i+1}, v_{i+2}, \dots, v_{i+T}]$. 

The PCA-based event detection at time $i+T+1$ is described as follows [8]: 

1) Calculate the matrix $C_Y = Z_{p \times T} (Z_{p \times T})^{\top}$. 

2) Calculate the $T$ nonzero eigenvalues and eigenvectors of $C_Y$. 

3) Rearrange the $T$ eigenvalues in decreasing order, with the eigenvectors being the principal components (PCs). 

4) Out of the $T$ PCs, select the $m$ largest eigenvalues and the corresponding PCs. 

5) Form a new $m$-dimensional subspace from the $m$ PCs. 

6) Project the vector $v_{i+T+1}$ onto the $m$-dimensional PC-based space. 

7) Calculate the length of the projection vector. 
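
The seven steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the implementation used for the results: the measurement matrix is taken to have one column per time sample, the matrix in step 1 is taken to be the product of the measurement matrix with its transpose, and the function name `pca_projection_length` and the toy data are hypothetical. How the resulting length is turned into a detection decision is not specified in the text, so the sketch stops at step 7.

```python
import numpy as np

def pca_projection_length(Z, v_new, m):
    """Length of the projection of v_new onto the top-m principal components of Z.

    Z     : p x T matrix, one column of p measurements per time sample
    v_new : length-p measurement vector at the detection time
    m     : number of principal components to keep
    """
    C = Z @ Z.T                                # step 1: p x p matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # step 2: eigen-decomposition
    order = np.argsort(eigvals)[::-1]          # step 3: decreasing eigenvalues
    pcs = eigvecs[:, order[:m]]                # steps 4-5: top-m PCs span the subspace
    coords = pcs.T @ v_new                     # step 6: coordinates in the PC subspace
    return np.linalg.norm(coords)              # step 7: length of the projection

# Toy data: every past sample points along the first coordinate axis.
Z = np.outer(np.array([1.0, 0.0, 0.0]), np.ones(4))
length_aligned = pca_projection_length(Z, np.array([3.0, 4.0, 0.0]), m=1)
length_orthogonal = pca_projection_length(Z, np.array([0.0, 4.0, 0.0]), m=1)
print(length_aligned, length_orthogonal)
```

In the toy case only the component of the new vector along the dominant direction of the past data contributes to the length, so a new sample that leaves the learned subspace yields a short projection.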



JOURNAL OF LATEX CLASS FILES, VOL. 13, NO. 9, SEPTEMBER 2014 


III. Numerical Examples 

This case is based on field data. The operating data of a power grid in China were collected, including data on substations, breakers, lines, buses, generators, frequency, and so on. All of the PMU data are real-life data from five different interconnected power grids in China. Fig. 1 shows the structure of the data lists from our test power grid. The sampling interval is 1 min and the sampling duration is 4320 min (3 days). 

These data were exported from the power grid database, and the time of the grid failure was not known in advance. After our analysis, the results were compared with the real-life situation. At first, the power flow data were analyzed by the PCA method, but no valid information was obtained in this way (Fig. 2). Then the active power output data of all generators were placed in a random matrix in time order. For every 10 samples (10 min), we form a single random matrix, normalize each row of the matrix, and calculate the mean spectral radius of the matrix. The mean spectral radius is regarded as a big data analytic index that changes over time. 
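
The windowing procedure can be sketched as below. The text says only that each row is normalized; zero mean and unit variance per row is an assumption, as are the helper names (`standardize_rows`, `split_windows`, `windowed_index`), the synthetic five-generator data, and the stand-in statistic. Any per-window statistic, such as the mean spectral radius, can be passed in through the `statistic` argument.

```python
import numpy as np

def standardize_rows(window):
    """Give every row zero mean and unit variance; constant rows become all zeros."""
    centered = window - window.mean(axis=1, keepdims=True)
    std = centered.std(axis=1, keepdims=True)
    std[std == 0.0] = 1.0                      # avoid dividing by zero
    return centered / std

def split_windows(data, width):
    """Cut an N x (total samples) array into consecutive N x width blocks."""
    n_windows = data.shape[1] // width
    return [data[:, k * width:(k + 1) * width] for k in range(n_windows)]

def windowed_index(data, width, statistic):
    """Apply `statistic` to each standardized window and return one value per window."""
    return np.array([statistic(standardize_rows(w)) for w in split_windows(data, width)])

def largest_cov_eigenvalue(w):
    """Stand-in statistic: largest eigenvalue of the window's sample covariance."""
    return np.linalg.eigvalsh(w @ w.T / w.shape[1]).max()

rng = np.random.default_rng(3)
data = rng.standard_normal((5, 4320))          # 5 hypothetical generators, 3 days at 1 min
index = windowed_index(data, width=10, statistic=largest_cov_eigenvalue)
print(index.shape)                             # one value per 10-minute window
```

Plotting the returned array against the window number gives a time series of the index, in which a sudden change marks a candidate event.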



Fig. 1: The structure of the data lists from a certain power 
grid in China 



Fig. 2: The results of load flow analysis 

Fig. 3 shows the relationship between the Marchenko-Pastur law and the kernel density estimate. Their degree of matching decreases when the grid fails. The blue curve represents the M-P law and the red curve represents the empirical eigenvalue density. The more closely the blue and red curves coincide, the better the matrix agrees with the M-P law. Comparing (a) with (b), the empirical eigenvalue density of the pre-fault matrix agrees with the M-P law much better than that of the fault matrix. In such a system, dramatic load changes or system failures break the random (i.i.d.) distribution of the system data. From a statistical perspective, this leads to a fluctuation in a certain direction, as well as a deviation of the system data from the M-P law. When other generators increase their output to maintain the stability of the grid, the random matrix gradually satisfies the i.i.d. condition again, so the blue and red curves come close to each other again in (c). From a statistical perspective, the pre-fault state describes the global behavior (the M-P law) of the power grid, while the fault reveals the local behavior of the grid. We know by experience that faults tend to occur locally. The M-P law is the asymptotic distribution when the data size grows to infinity. 

(c) After fault 

Fig. 3: Empirical eigenvalue density functions of random matrices. The Marchenko-Pastur law is given in eq. (1) 

The mean spectral radius changes suddenly when a failure occurs, as shown in Fig. 4. Failure detection can be achieved by observing the relationship between the M-P law and the empirical eigenvalue density of the matrix composed directly of the sampled data. Similarly, a drastic change of the mean spectral radius indicates the occurrence of a failure. This result is consistent with the result of the PCA method shown in Fig. 5, and both are very close to the real situation shown in Fig. 6. Unlike the data obtained from simulation, the outputs of all generators are always changing in a real power grid. When one generator fails, other generators increase their output to maintain the stability of the grid, so the mean spectral radius returns to its original position (Fig. 4) earlier than the fault is removed (Fig. 6). 

Fig. 4: The average radius using time series for the events 

Fig. 5: Generator unit output analysis chart by PCA for the events 

Fig. 6: Power change of the faulty generator 

IV. Conclusion 

A novel random matrix method is proposed to detect power grid failures. Compared with the widely used principal component analysis, our method exploits the convergence of the empirical spectral density, a phenomenon arising from high-dimensional probability. We compared the two methods using real-life datasets collected in China. Both methods appear to work well on our system data. The next stage of research is to use sketching as a tool [15,16]. Our proposed sketching method projects the data set into a lower-dimensional subspace. Dimensionality reduction techniques, such as principal component analysis, are commonly used in statistics. However, their focus is usually on reducing the number of variables. Our method aims to reduce the number of observations while keeping the algebraic structure of the data. This leads to a speed-up in the subsequent (frequentist or Bayesian) regression analysis, because the run time of common algorithms usually depends heavily on the data size. 